Anthropic Updates Safety Policy, Establishes 'Safety Thresholds' to Keep AI From Getting Out of Control
Amid the rapid development of artificial intelligence, Anthropic has announced an update to its Responsible Scaling Policy (RSP), which aims to manage the potential risks posed by highly capable AI systems. With this move, Anthropic, the company behind the popular chatbot Claude, is seeking a balance between steadily advancing AI capabilities and the safety standards they require. The updated policy introduces what it calls capability thresholds, which serve as an additional safeguard as AI model capabilities increase.